CLDSRV-732: Fix flakiness starting cloudserver in CI #5918

Merged
bert-e merged 1 commit into development/9.0 from bugfix/CLDSRV-732-nyc-flakiness
Aug 26, 2025

@BourgoisMickael
Contributor

Recently, the CI step that waits for S3 has been failing after its 40s timeout.

The s3-stderr log contains:

npm error code E403
npm error 403 403 Forbidden - GET https://registry.npmjs.org/nyc
npm error 403 In most cases, you or one of your dependencies are requesting
npm error 403 a package version that is forbidden by your security policy, or
npm error 403 on a server you do not have access to.
npm error A complete log of this run can be found in: /root/.npm/_logs/2025-08-26T12_00_55_746Z-debug-0.log

This PR builds an additional image that includes nyc and a command for collecting coverage during tests.
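
For context, a wait step of this kind typically polls the server until a deadline. The following is a minimal sketch, not the project's actual CI script: the `wait_for` helper is hypothetical, and the 40s budget is taken from the description above. It illustrates why a server that never starts (here, apparently because nyc could not be fetched from the registry at runtime) surfaces as a timeout rather than as the underlying npm error.

```shell
#!/bin/sh
# Hypothetical helper: retry a check command once per second until it
# succeeds or the time budget (in seconds) is used up.
wait_for() {
    timeout_s=$1
    shift
    elapsed=0
    while [ "$elapsed" -lt "$timeout_s" ]; do
        # Run the check quietly; success ends the wait immediately.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # Budget exhausted: the caller only sees a timeout, not the root cause.
    return 1
}
```

A CI step might then call something like `wait_for 40 nc -z localhost 8000`, where the host, port, and use of `nc` are assumptions made for illustration.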

@bert-e
Contributor

bert-e commented Aug 26, 2025

Hello bourgoismickael,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@bert-e
Contributor

bert-e commented Aug 26, 2025

Incorrect fix version

The Fix Version/s in issue CLDSRV-732 contains:

  • 9.0.23

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 9.0.23

  • 9.1.0

Please check the Fix Version/s of CLDSRV-732, or the target
branch of this pull request.

@codecov

codecov bot commented Aug 26, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.19%. Comparing base (0cbe863) to head (0323321).
⚠️ Report is 9 commits behind head on development/9.0.
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@               Coverage Diff                @@
##           development/9.0    #5918   +/-   ##
================================================
  Coverage            83.19%   83.19%           
================================================
  Files                  188      188           
  Lines                12093    12093           
================================================
  Hits                 10061    10061           
  Misses                2032     2032           
Flag Coverage Δ
ceph-backend-test 65.65% <ø> (-0.02%) ⬇️
file-ft-tests 66.03% <ø> (ø)
kmip-ft-tests 26.83% <ø> (ø)
mongo-v0-ft-tests 67.88% <ø> (-0.03%) ⬇️
mongo-v1-ft-tests 67.88% <ø> (ø)
multiple-backend 35.30% <ø> (ø)
quota-tests 32.05% <ø> (-0.04%) ⬇️
quota-tests-inflights 34.03% <ø> (ø)
unit 67.28% <ø> (ø)
utapi-v2-tests 33.20% <ø> (ø)

@BourgoisMickael BourgoisMickael force-pushed the bugfix/CLDSRV-732-nyc-flakiness branch from 12b0ced to 75be1e5 Compare August 26, 2025 13:35
@BourgoisMickael BourgoisMickael force-pushed the bugfix/CLDSRV-732-nyc-flakiness branch from 75be1e5 to 0323321 Compare August 26, 2025 14:20
Copilot AI left a comment

Pull Request Overview

This PR addresses CI flakiness by creating a dedicated Docker image for test coverage that includes the nyc package pre-installed, eliminating runtime npm install failures that were causing 40-second timeouts.

Key changes:

  • Creates a multi-stage Dockerfile with separate production and test coverage images
  • Moves coverage script logic from inline bash to a dedicated shell script
  • Updates all CI jobs to use the new test coverage image variant

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
| File | Description |
| --- | --- |
| docker-test-with-coverage.sh | New script containing coverage generation logic previously embedded in docker-compose |
| Dockerfile | Adds multi-stage build with testcoverage stage that pre-installs nyc |
| .github/workflows/tests.yaml | Updates all test jobs to use testcoverage image and adds build step for it |
| .github/workflows/release.yaml | Adds build and push step for testcoverage image in release workflow |
| .github/docker/docker-compose.yaml | Removes inline coverage script now handled by dedicated image |

Comment thread docker-test-with-coverage.sh
Comment on lines +9 to +10
kill -TERM $PID 2>/dev/null || true
wait $PID

Copilot AI Aug 26, 2025

There's a potential race condition where the process might exit before wait $PID is called, causing the wait to fail. Consider checking if the process is still running before waiting.

Suggested change
- kill -TERM $PID 2>/dev/null || true
- wait $PID
+ if kill -0 $PID 2>/dev/null; then
+     wait $PID
+ fi
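
One point worth checking against this suggestion: in POSIX shells, `wait` on a background child that has already exited still returns the exit status the shell recorded for it, so `kill ... || true` followed by `wait $PID` should not break merely because the process exited first. A small sketch, using a hypothetical `demo_wait_after_exit` function and a stand-in child process rather than cloudserver:

```shell
#!/bin/sh
# Stand-in for the server: a child that exits with status 7 right away.
demo_wait_after_exit() {
    sh -c 'exit 7' &
    PID=$!
    sleep 1   # let the child exit before we signal or wait for it

    # The child is gone, so kill fails; '|| true' swallows the error.
    kill -TERM "$PID" 2>/dev/null || true

    # wait still reports the exit status the shell recorded for the child.
    status=0
    wait "$PID" || status=$?
    echo "exit status: $status"
}

demo_wait_after_exit
```

By contrast, guarding the `wait` behind `kill -0 $PID` would skip it once the child has exited, so the script would no longer pick up the child's exit status.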

@bert-e
Contributor

bert-e commented Aug 26, 2025

Request integration branches

Waiting for integration branch creation to be requested by the user.

To request integration branches, please comment on this pull request with the following command:

/create_integration_branches

Alternatively, the /approve and /create_pull_requests commands will automatically
create the integration branches.

@BourgoisMickael BourgoisMickael changed the title CLDSRV-732: Fix flakiness starting cloudserver CLDSRV-732: Fix flakiness starting cloudserver in CI Aug 26, 2025
@BourgoisMickael BourgoisMickael requested review from a team, DarkIsDude, davidtencer, francoisferrand, fredmnl and welansari and removed request for a team August 26, 2025 15:00
@BourgoisMickael
Contributor Author

/create_integration_branches

@bert-e
Contributor

bert-e commented Aug 26, 2025

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/7.10
  • development/7.4
  • development/7.70
  • development/8.8

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: create_integration_branches

@bert-e
Contributor

bert-e commented Aug 26, 2025

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

The following options are set: create_integration_branches

@ghost ghost left a comment

lgtm, I'm just not sure we need to change the release workflow here

cache-from: type=gha
cache-to: type=gha,mode=max

- name: Build and push test coverage image

do we really want to create a new tagged image just for the coverage?

Contributor Author

It's there so you can run the CI's docker compose for a specific release version with coverage.

Contributor Author

@BourgoisMickael BourgoisMickael Aug 26, 2025

Indeed, we could skip coverage outside CI when targeting a specific version.
But the image is small: there's only 1 additional layer for nyc and a change of CMD, and that's 15s of build time.

@BourgoisMickael
Contributor Author

/approve

@bert-e
Contributor

bert-e commented Aug 26, 2025

Build failed

The build for commit did not succeed in branch w/9.1/bugfix/CLDSRV-732-nyc-flakiness

The following options are set: approve, create_integration_branches

Comment thread .github/workflows/tests.yaml
Comment thread .github/workflows/release.yaml
Comment thread Dockerfile
@bert-e
Contributor

bert-e commented Aug 26, 2025

In the queue

The changeset has received all authorizations and has been added to the
relevant queue(s). The queue(s) will be merged in the target development
branch(es) as soon as builds have passed.

The changeset will be merged in:

  • ✔️ development/9.0

  • ✔️ development/9.1

The following branches will NOT be impacted:

  • development/7.10
  • development/7.4
  • development/7.70
  • development/8.8

There is no action required on your side. You will be notified here once
the changeset has been merged. In the unlikely event that the changeset
fails permanently on the queue, a member of the admin team will
contact you to help resolve the matter.

IMPORTANT

Please do not attempt to modify this pull request.

  • Any commit you add on the source branch will trigger a new cycle after the
    current queue is merged.
  • Any commit you add on one of the integration branches will be lost.

If you need this pull request to be removed from the queue, please contact a
member of the admin team now.

The following options are set: approve, create_integration_branches

@bert-e
Contributor

bert-e commented Aug 26, 2025

Queue build failed

The corresponding build for the queue failed:

  • Check out the status page.
  • Identify the failing build and review the logs.
  • If no issue is found, re-run the build.
  • If an issue is identified, check out the steps below to remove
    the pull request from the queue for further analysis and maybe rebase/merge.
Remove the pull request from the queue
  • Add a /wait comment on this pull request.
  • Click on login on the status page.
  • Go into the manage page.
  • Find the option called Rebuild the queue and click on it.
    Bert-E will loop again on all pull requests to put the valid ones
    in the queue again, while skipping the one with the /wait comment.
  • Wait for the new queue to merge, then merge/rebase your pull request
    with the latest changes to then work on a proper fix.
  • Once the issue is fixed, delete the /wait comment and
    follow the usual process to merge the pull request.

@bert-e
Contributor

bert-e commented Aug 26, 2025

I have successfully merged the changeset of this pull request
into targeted development branches:

  • ✔️ development/9.0

  • ✔️ development/9.1

The following branches have NOT changed:

  • development/7.10
  • development/7.4
  • development/7.70
  • development/8.8

Please check the status of the associated issue CLDSRV-732.

Goodbye bourgoismickael.

@bert-e bert-e merged commit 1702399 into development/9.0 Aug 26, 2025
27 checks passed
@bert-e bert-e deleted the bugfix/CLDSRV-732-nyc-flakiness branch August 26, 2025 17:30

6 participants